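We examine the zero-temperature Metropolis Monte Carlo algorithm as a tool for training a neural network by minimizing a loss function. We find that, as expected on theoretical grounds and as shown empirically by other authors, Metropolis Monte Carlo can train a neural network with an accuracy comparable to that of gradient descent, if not necessarily as quickly. The Metropolis algorithm does not fail automatically when the number of parameters of a neural network is large. It can fail when a neural network's structure or neuron activations are strongly heterogeneous, and we introduce an adaptive Monte Carlo algorithm, AMC, to overcome these limitations. The intrinsic stochasticity and numerical stability of the Monte Carlo method allow AMC to train deep neural networks and recurrent neural networks in which the gradient is too small or too large to allow training by gradient descent. Monte Carlo methods offer a complement to gradient-based methods for training neural networks, allowing access to a distinct set of network architectures and principles.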
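A minimal sketch of zero-temperature Metropolis Monte Carlo used as a loss minimizer may help make the idea concrete. It is not the authors' implementation: the toy linear model, the Gaussian proposal applied to all parameters at once, and the names `loss`, `zero_temperature_metropolis`, `sigma`, and `steps` are assumptions chosen for illustration. At zero temperature a proposed move is accepted only if the loss does not increase.

```python
import random


def loss(params, data):
    """Mean squared error of the toy linear model y = w * x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def zero_temperature_metropolis(loss_fn, params, data, steps=5000, sigma=0.05, seed=0):
    """Minimize loss_fn by random proposals, accepting only non-increasing moves.

    Assumption: every parameter receives an independent Gaussian perturbation
    of standard deviation sigma at each step.
    """
    rng = random.Random(seed)
    current = list(params)
    current_loss = loss_fn(current, data)
    for _ in range(steps):
        proposal = [p + rng.gauss(0.0, sigma) for p in current]
        proposal_loss = loss_fn(proposal, data)
        # Zero-temperature acceptance rule: never accept an uphill move.
        if proposal_loss <= current_loss:
            current, current_loss = proposal, proposal_loss
    return current, current_loss


# Toy data generated from y = 2x + 1 on a grid of points in [-1, 1].
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 21 - 10)]
initial = [0.0, 0.0]
initial_loss = loss(initial, data)
trained, final_loss = zero_temperature_metropolis(loss, initial, data)
```

No gradient is evaluated anywhere in the loop, which is why a procedure of this kind remains usable when gradients are vanishingly small or very large.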
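We show that cellular automata can classify data by inducing a form of dynamical phase coexistence. We use Monte Carlo methods to search for general two-dimensional deterministic automata that classify images on the basis of activity, the number of state changes that occur in a trajectory initiated from the image. When the number of timesteps of the automaton is a trainable parameter, the search scheme identifies automata that generate a population of dynamical trajectories displaying high or low activity, depending on initial conditions. Automata of this nature behave as nonlinear activation functions whose output is effectively binary, resembling an emergent version of a spiking neuron.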
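The notion of activity defined above (the number of state changes along a trajectory started from an image) can be sketched as follows. The update rule used here, Conway's Game of Life on a periodic grid, is only a stand-in chosen for illustration; the automata in the abstract are found by Monte Carlo search and are not specified there. The names `step`, `activity`, and `life_rule` are hypothetical.

```python
def step(grid, rule):
    """Apply a deterministic rule to every cell, using its 8 neighbours on a periodic grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    new_grid = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            neighbours = sum(
                grid[(i + di) % n_rows][(j + dj) % n_cols]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            )
            row.append(rule(grid[i][j], neighbours))
        new_grid.append(row)
    return new_grid


def activity(grid, rule, timesteps):
    """Count the cell state changes over a trajectory of the given number of timesteps."""
    changes = 0
    for _ in range(timesteps):
        new_grid = step(grid, rule)
        changes += sum(
            old != new
            for old_row, new_row in zip(grid, new_grid)
            for old, new in zip(old_row, new_row)
        )
        grid = new_grid
    return changes


def life_rule(cell, neighbours):
    """Stand-in rule (Game of Life), not a rule taken from the abstract."""
    return 1 if neighbours == 3 or (cell == 1 and neighbours == 2) else 0


# Two 5x5 "images": an oscillating pattern and a static one.
blinker = [[0] * 5 for _ in range(5)]
for col in (1, 2, 3):
    blinker[2][col] = 1

block = [[0] * 5 for _ in range(5)]
for r, c in ((1, 1), (1, 2), (2, 1), (2, 2)):
    block[r][c] = 1

high = activity(blinker, life_rule, timesteps=3)
low = activity(block, life_rule, timesteps=3)
```

Thresholding the activity then gives a binary label for each initial image, which is the sense in which such an automaton acts like a classifier with an effectively binary output.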
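Using a toy nautical navigation environment, we show that dynamic programming can be used when only partial information about a partially observed Markov decision process (POMDP) is known. By incorporating uncertainty into our model, we show that navigation policies can be constructed that maintain safety. By adding controlled sensing methods, we show that these policies can simultaneously lower measurement costs.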
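The use of reinforcement learning (RL) in scientific applications, such as materials design and automated chemistry, is increasing. A major challenge, however, is that measuring the state of a system is often costly and time-consuming in scientific applications, whereas policy learning with RL requires a measurement after each time step. In this work, we make the measurement cost explicit in the form of a costed reward and propose a framework that enables off-the-shelf deep RL algorithms to learn a policy for both selecting actions and determining whether or not to measure the current state of the system at each time step. In this way, the agents learn to balance the need for information against its cost. Our results show that, when trained under this regime, Dueling DQN and PPO agents can learn optimal action policies while making up to 50% fewer state measurements, and that recurrent neural networks can reduce the number of measurements by more than 50%. We postulate that these reductions can help lower the barrier to applying RL to real-world scientific applications.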
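One way to picture the framework is as a wrapper around an environment whose actions are pairs (control action, measure or not). The sketch below is an assumption-laden illustration, not the authors' code: the toy `ChainEnv`, the wrapper name `MeasureOrNotWrapper`, the fixed `measurement_cost` subtracted from the reward, and the choice to return the last measured state when no measurement is taken are all hypothetical.

```python
class ChainEnv:
    """Hypothetical toy environment: walk along a chain until the right end is reached."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action: 1 moves right, anything else moves left (clipped to the chain).
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length, self.position + move))
        done = self.position == self.length
        reward = 1.0 if done else 0.0
        return self.position, reward, done


class MeasureOrNotWrapper:
    """Charge a cost for each state measurement; otherwise return the stale observation."""

    def __init__(self, env, measurement_cost=0.1):
        self.env = env
        self.measurement_cost = measurement_cost
        self.last_observation = None

    def reset(self):
        self.last_observation = self.env.reset()
        return self.last_observation

    def step(self, composite_action):
        action, measure = composite_action
        state, reward, done = self.env.step(action)
        if measure:
            # Costed reward: the agent pays for fresh information.
            self.last_observation = state
            reward -= self.measurement_cost
        return self.last_observation, reward, done


env = MeasureOrNotWrapper(ChainEnv(length=5), measurement_cost=0.1)
env.reset()
observations, total_reward, done, t = [], 0.0, False, 0
while not done:
    # Fixed illustrative policy: always move right, measure on even steps only.
    obs, reward, done = env.step((1, t % 2 == 0))
    observations.append(obs)
    total_reward += reward
    t += 1
```

In this run the agent reaches the goal in five steps while measuring three times, so the return is the goal reward minus three measurement costs. A learned policy would instead choose the measurement flag itself.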
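Using a model heat engine, we show that neural-network-based reinforcement learning can identify thermodynamic trajectories of maximal efficiency. We consider both gradient-based and gradient-free reinforcement learning. We use an evolutionary learning algorithm to evolve a population of neural networks, subject to a directive to maximize the efficiency of a trajectory composed of a set of elementary thermodynamic processes; the resulting networks learn to carry out the maximally efficient Carnot, Stirling, or Otto cycles. When given an additional irreversible process, this evolutionary scheme learns a previously unknown thermodynamic cycle. Gradient-based reinforcement learning is able to learn the Stirling cycle, whereas the evolutionary approach achieves the optimal Carnot cycle. Our results show how reinforcement learning strategies developed for game playing can be applied to solve physical problems conditioned upon path-extensive order parameters.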
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as supply chains (inventory optimization), traffic, and the transition towards carbon-free energy generation through battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
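The idea of optimizing jointly over forecast scenarios with a sample average approximation can be illustrated on a deliberately tiny problem. The sketch below is not the winning entry's method, which the abstract does not detail: the single decision variable (energy bought ahead of time), the cost model in `scenario_cost`, the prices, and the demand scenarios are all made-up assumptions. The real challenge used mixed integer programming rather than a grid search.

```python
def scenario_cost(purchase, demand, price=1.0, shortfall_penalty=3.0):
    """Cost of buying `purchase` units ahead, paying a penalty for any unmet demand."""
    shortfall = max(0.0, demand - purchase)
    return price * purchase + shortfall_penalty * shortfall


def sample_average_approximation(candidates, scenarios):
    """Pick the candidate decision with the lowest cost averaged over all scenarios."""

    def average_cost(decision):
        return sum(scenario_cost(decision, d) for d in scenarios) / len(scenarios)

    best = min(candidates, key=average_cost)
    return best, average_cost(best)


# Hypothetical demand scenarios, e.g. samples drawn from a probabilistic forecast.
scenarios = [8.0, 10.0, 12.0, 14.0, 16.0]
candidates = [float(x) for x in range(0, 21)]

best_purchase, expected_cost = sample_average_approximation(candidates, scenarios)
```

Note that the chosen purchase is larger than the mean scenario (12), because shortfalls are penalized more heavily than over-purchasing. Optimizing against a single point forecast would miss this asymmetry.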
Importance: The prevalence of severe mental illnesses (SMIs) in the United States is approximately 3% of the whole population. The ability to conduct risk screening of SMIs at large scale could inform early prevention and treatment. Objective: A scalable machine learning-based tool was developed to conduct population-level risk screening for SMIs, including schizophrenia, schizoaffective disorders, psychosis, and bipolar disorders, using 1) healthcare insurance claims and 2) electronic health records (EHRs). Design, setting and participants: Data from beneficiaries of a nationwide commercial healthcare insurer with 77.4 million members and data from patients in EHRs from eight academic hospitals based in the U.S. were used. First, the predictive models were constructed and tested using data in case-control cohorts from insurance claims or EHR data. Second, the performance of the predictive models across data sources was analyzed. Third, as an illustrative application, the models were further trained to predict risks of SMIs among 18-year-old young adults and individuals with substance-associated conditions. Main outcomes and measures: Machine learning-based predictive models for SMIs in the general population were built based on insurance claims and EHRs.
In this paper, we consider the problem of adjusting the exploration rate when using value-of-information-based exploration. We do this by converting the value-of-information optimization into a problem of finding equilibria of a flow for a changing exploration rate. We then develop an efficient path-following scheme for converging to these equilibria and hence uncovering optimal action-selection policies. Under this scheme, the exploration rate is automatically adapted according to the agent's experiences. Global convergence is theoretically assured. We first evaluate our exploration-rate adaptation on the Nintendo GameBoy games Centipede and Millipede. We demonstrate aspects of the search process. We show that our approach yields better policies in fewer episodes than conventional search strategies relying on heuristic, annealing-based exploration-rate adjustments. We then illustrate that these trends hold for deep, value-of-information-based agents that learn to play ten simple games and over forty more complicated games for the Nintendo GameBoy system. Performance either near or well above the level of human play is observed.
Heterogeneous treatment effects (HTEs) are commonly identified during randomized controlled trials (RCTs). Identifying subgroups of patients with similar treatment effects is of high interest in clinical research to advance precision medicine. Often, multiple clinical outcomes are measured during an RCT, each having a potentially heterogeneous effect. Recently there has been high interest in identifying subgroups from HTEs; however, there has been less focus on developing tools for settings with multiple outcomes. In this work, we propose a framework for partitioning the covariate space to identify subgroups across multiple outcomes based on the joint CIs. We test our algorithm on synthetic and semi-synthetic data with two outcomes, and demonstrate that it is able to capture the HTE in both outcomes simultaneously.
Data-driven modeling has become a key building block in computational science and engineering. However, data that are available in science and engineering are typically scarce, often polluted with noise and affected by measurement errors and other perturbations, which makes learning the dynamics of systems challenging. In this work, we propose to combine data-driven modeling via operator inference with the dynamic training via roll outs of neural ordinary differential equations. Operator inference with roll outs inherits interpretability, scalability, and structure preservation of traditional operator inference while leveraging the dynamic training via roll outs over multiple time steps to increase stability and robustness for learning from low-quality and noisy data. Numerical experiments with data describing shallow water waves and surface quasi-geostrophic dynamics demonstrate that operator inference with roll outs provides predictive models from training trajectories even if data are sampled sparsely in time and polluted with noise of up to 10%.